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The Minimum Weight Steiner Tree (MST) is an important combinatorial optimization problem 
over networks that has applications in a wide range of fields. Here we discuss a general technique 
to translate the imposed global connectivity constrain into many local ones that can be analyzed 
with cavity equation techniques. This approach leads to a new optimization algorithm for MST and 
allows to analyze the statistical mechanics properties of MST on random graphs of various types. 



Given a graph or a lattice, finding a subgraph that opti- 
mizes some global cost function is an important problem 
in many fields. One of the most basic versions of this 
is known as the Minimum Weight Steiner Tree (MST) 
problem. 

Given an undirected graph with positive weights on 
the edges, the MST problem consists in finding a con- 
nected subgraph of minimum weight that contains a se- 
lected set of "terminal" vertices. Such construction may 
require the inclusion of some nonterminal nodes which 
are called Steiner nodes. Clearly, an optimal sub-graph 
must be a tree. Solving MST is a key component of many 
optimization problems involving real networks. Concrete 
examples are network reconstruction in biology (phyloge- 
netic trees and regulatory sub-networks) , Internet multi- 
casting, circuit design and power or water distribution 
networks design, just to mention few famous ones. MST 
is also a beautiful mathematical problem in itself which 
lies at the root of computer science being both NP- 
complete [l| and difficult to approximate [2j|. In physics 
the Steiner tree problem has similarities with many basic 
models such as polymers, self avoiding walks or transport 
networks (e.g. [3|) with a non-trivial interplay between 
local an global frustration. 

Here we show that the cavity approach of statistical 
physics can be used to both analyze and solve this prob- 
lem on random graphs (as e.g. [4J, |?3, |6j) once an ap- 
propriate representation is chosen. We actually study 
the even more general (and eventually harder) I?— MST 
problem in which we consider the depth of the tree from 
a root terminal node to be bounded by D. Unfortu- 
nately the traditional techniques for studying topologi- 
cally connected structures, as for instance the so-called 
0(n) model, are incompatible with the cavity method. 
We provide here instead an arborescent representation of 
the Steiner problem which allows to implement explicitly 
global connectivity constraints in terms of local ones. 

In recent years many algorithmic results have appeared 
showing the efficacy of the cavity approach for optimiza- 
tion and inference problems defined over both sparse and 
dense random networks of constraints [J, la, la, LZI, ls|, IsJ - 
These performances are understood in terms of factoriza- 
tion properties of the Gibbs measure over ground states, 



which can be also seen as the onset of correlation decay 
along the iterations of the cavity equations [lfj|. Here we 
make a step further by presenting evidence for the exact- 
ness of the cavity approach for a qualitatively different 
class of models, namely problems which are subject to 
rigid global constraints that couple all variables. Quite 
often this type of global constraint is of topological origin 
and is common to many problems across disciplines (e.g. 
the Traveling Salesman Problem in computer science or 
Self-Avioding Walks in physics). 

Our work addresses two questions: by analyzing the 
distributional equations we provide the phase diagrams 
of the problem in the control parameters a and D, where 
aN is the number of terminals in a graph of N vertices 
and D is the allowed depth of the tree from a randomly 
chosen root. We compute quantities like the behavior of 
the minimum cost as a function of D for a given frac- 
tion a of terminals, or the number of Steiner nodes cN s 
where both c and the exponent s depend on D and a. 
Such quantities are of extreme interest in that they are 
directly connected with the topology of the tree. For 
instance, for the case of complete graphs with random 
weights we find that an extremely small depth Dm is suf- 
ficient for reaching costs which are close to optimal ones 
for the unbounded trees (e.g. for the complete graph 
with random weights we find that Djy ~ log log N is 
sufficient to reach asymptotically a cost close to the op- 
timal one £(3) [ll|, ll2| of the minimum spanning tree 
which has depth e(iV 1 / 3 ) [13]). For finite D the results 
of the cavity approach can be compared with rigorous 
upper and lower bounds [l8| making us conjecture that 
the cavity approach is exact, as it happens for random 
Matchings [14 ] . Similar results hold for other classes of 
random graphs. Here we give results for fixed degree and 
Scale- Free graphs, for which some non trivial patterns of 
solutions for optimal Steiner trees appear. 

On the algorithmic side, the arborescent representa- 
tion of the problem leads to cavity equations that can be 
turned into an algorithm for solving single instances. 

Very few results are known on the Steiner problem on 
random graphs in the regime in which a is finite. For the 
complete graph with random weights some upper and 
lower bounds for the minimum cost have been derived 



[la ], which are compatible with those predicted by the 
cavity method. For finite degree random graphs (e.g. 
Erdos-Renyi, fixed degree or scale-free graphs) much less 
is known. 

The model. We model the Steiner tree problem as a 
rooted tree (such a construction is often associated with 
the term "arborescence") . Each node i is endowed with a 
pair of variables (j?i,di), a pointer pt to some other node 
in the neighborhood V(i) of i and a depth di G {1, . . . , D} 
defined as the distance from the root. Terminal nodes 
must point to some other node in the final tree and hence 
Pi G V(i). The root node conventionally points to itself . 
Non-root nodes either point to some other node in V(i) 
if they are part of the tree (Steiner and terminal nodes) 
or just do not point to any node if they are not part of 
the tree (allowed only for non-terminals), a fact that we 
represent by allowing for an extra state for the pointer 
Pi G V(i) U 0. The depth of the root is set to zero, di = 
while for the other nodes in the tree the depths measure 
the distance from the root along the unique oriented path 
from the node to the root 

In order to impose the global connectivity constraint 
for the tree we need to impose the condition that if Pi = j 
then pj 7^ and dj = d, — 1. This condition forbids 
loops and guarantees that the pointers describe a tree. 
In building the cavity equations (or the Belief Propaga- 
tion equations), we need to introduce the characteristic 
functions fy which impose such constraints over configu- 
rations of the independent variables (pi, di). For any edge 
(i,j) we have the indicator function fy = gijgji where 



On a fixed point, one can compute marginals ipj 
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Cavity Equations. 
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The cavity equations take the form 
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Qk^j (dj,Pj) oc 2J p k^j (dk,Pk) fjk (dk,Pk, d 3 ,Pj) (2) 
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where c^ is the weight of the link (i, j), with c^ = oo 
if i is a terminal. The oc symbol accounts for a mul- 
tiplicative normalization constant. Allowed configura- 
tions are weighted by e~^ Cii where /? _1 is a temperature 
fixing the energy level. The zero temperature limit is 
taken by considering the following change of variables: 
tpj^i (dj,pj) = 0- 1 log Pj^i {dj.pj) and <p k ^j (dj,Pj) = 
[3~ x logQfc_»j (dj,Pj). In the /3^oo limit Eq.[T][2] reduce 
to: 
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The previous two equalities must be understood to 
hold except for an additive constant. Eqs. [3j|4] are in 
the so called "Max Sum" form. 



^j (dj,Pj) = ~c m +22<t>k->j(dj,pj) 






(5) 



and the optimum tree should be given by arg max ipj . 

If the starting graph is a tree ipj-^(dj,Pj) can be inter- 
preted as the minimum cost change of removing a vertex 
j with forced configuration dj , pj from the subgraph with 
link (i,j) already removed. We introduce the variables 
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Pj = and Pj ^ k, respectively. Eqs. [3j4] can then 

be solved by repeated iteration of the following set of 

equations: 
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For graphs without cycles the above equations are 
guaranteed to converge to the optimal solution. In graphs 
with cycles, these equations may instead fail to converge 
in some cases. For the classes of random graphs stud- 
ied in this work, this appears not to be due to a replica 
symmetry breaking instability but rather to the effect of 
local structures in the underlying graph (as it is known to 
happen in simpler problems such as random matchings 
[la]). This observation is corroborated by the analysis of 
the distributional cavity equations discussed later. While 
more work is needed to understand this point, from the 
algorithmic viewpoint the problem can be overcome by 
applying a small perturbation [a]. The term ipj(dj,pj) 
of Eq. [5] multiplied by a (small) constant p is added to 
the rhs. of Eq. [3j This leads to a set of equations which 
show good convergence properties for vanishing p. 

An equivalent formulation of the problem can be con- 
structed by introducing a link representation of the 
pointer variables (one may introduce link variables x%j = 
0, ±1, if i does not point j, 1 if i points j and —1 if j 
points i). In this representation, the number of states of 
the independent variables is just 3D which can be kept 
finite for complete graphs or at most of order log N for 
sparse graphs. 



Distributional equations and average case 
Population dynamics (or density evolution) is a power- 
ful tool to solve distributional equations that deal with 
a large number of random variables. In the physics com- 
munity the method was introduced in [l7|] for the study of 
spin glass models on diluted random graphs. Population 
dynamics is useful especially when the equations involve 
sums over many states of the variables. The underly- 
ing idea is to represent probability distributions with a 
population of random variables and use the equations to 
update such populations. After a suitably large number 
of updates the histogram of variables in the population 
will converge to a stable distribution. 

To obtain results on the N — > oo limit one would need 
to rescale simultaneously all d-dependent quantities in or- 
der to eliminate their direct dependence on A~ in Eqs. [6J- 
flOl We limited however ourselves here for all cases an- 
alyzed to large but finite TV, in particular because the 
obviously needed dependence of D on N for finite degree 
graphs makes this task even more involved. 

We will apply the population dynamics method to find 
the statistical properties of the cavity fields M,_>j = 
(Af^^Bi^^Cf^^D^^Ef^) in Eqs. IMlOl Given an 
ensemble of random graphs we will find the probabil- 
ity distribution of these fields from which we will derive 
the quantities of interest, namely the average minimum 
cost and average number of Steiner nodes as a func- 
tion of N, in the so called Bethe approximation which 
is implicit in the cavity approach. The method pro- 
ceeds by initializing at random a population of field vec- 
tors M a = (A d a ,B a ,Ci,D a ,Et) with a G [0,N P ] and 
d G [0,1?]. The first member M represents messages 
sent by root. Members with label a = 1, . . . , N t repre- 
sent messages sent by terminal nodes. Here Nt = aN p 
where a = K/N is the fraction of terminal nodes. Then 
the population dynamics algorithm works by updating 
the population using Eqs. IMTOl until convergence is 
reached. For brevity, we omit the details of this pro- 
cedure. Once convergence is reached, marginals tp a (d,p) 
can be computed using Eq. [5j The state (d*,p*) that 
maximizes the local marginal gives the energy contribu- 
tion of the a — th member. If p* ^ and Nt < a, then 
a is a Steiner member. Finally the minimum cost reads 
E = Ke t + (N — K) e s where e t and e s are the average 
energy of terminal and Steiner members. The fraction of 
Steiner members in the population will give the fraction 
of Steiner nodes in the ensemble of random graphs. 

In Figures [T]|3] we display numerical results for three 
classes of random graphs, namely complete graphs, fi- 
nite connectivity random graphs and scale-free graphs. 
We first verify a quite remarkable agreement between 
the output of the algorithm which finds Steiner trees on 
given random instances with the outcomes of the popu- 
lation dynamics averaged over the randomness. In Figs 
1-2, we estimate the dependence on the depth D of the 
minimum cost and of the size of the Steiner set nodes. 
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Figure 1: Z?-MST on complete graphs. Left: Minimum cost 
(at a = 0.5) and fraction of Steiner nodes (for N = 8000) as 
a function of D. Right: Comparison of Pop. dyn. with the 
algorithm on single samples for various values of N at a — 0.5. 
Fits are in very good agreement with known bounds. 



For complete graph with random weights we are able to 
provide an accurate estimate of the scaling exponents 
which for a = 1 are compatible with rational exponents 
predicted by rigorous analysis [18|. Moreover, we ob- 
serve a very rapid decrease of the minimum cost with 
D, compatible with TV ' \ > . This suggests that very 
few "hops" (~ log log N) are indeed sufficient to reach 
optimal costs. From a qualitative point of view we ob- 
serve a non trivial dependence on TV and a of the size of 
the Steiner set. The size itself turns out to be sublinear, 
with a rational exponent that depends on D. For fixed N 
there appears a maximum for relatively small values of 
a. For the Scale-Free graphs there appears an additional 
cuspid-like minimum. Finally, in Fig. [3] we provide the 
probability distribution of optimal weights for all classes. 
We conclude this letter by mentioning the connection 
with rigorous results. For the case of bounded depth 
trees on complete graphs our numerical results show that 
the cavity equations are indeed consistent with known 
bounds. As discussed in [18|, the analysis of a simple 
greedy algorithm and a Chernoff-type bound lead to up- 
per and lower bounds for the minimum cost that are 
able to identify the exact scaling exponent and to give 
bounds for the pre-factors. More precisely, it can be 
shown that the average minimum Ed grows with the size 
as A 1 ^ 2 -1 ) . The case D = 2 and a = 1 is particularly 
easy to understand: the greedy algorithm amounts at 
choosing a first set of Ai nodes at depth 1 by selecting 
the Ai links with smallest weights. Successively the re- 
maining N — Ni nodes at depth 2 are connected to the 
first layer by choosing the smallest weight for each node. 
By optimizing over the size of N\ one finds for the aver- 
age minimum cost Ez = lA^ 1 ' 3 (a naive guess may give 
an exponent 1/2 instead of 1/3). Comparisons with the 
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Figure 2: Fixed degree (FD) and scale-free (SF) graphs. Left: 
Minimum cost as function of a for different values of D. 
Right: Fraction of Steiner nodes as a function of a. The 
FD graphs have degree C = 3 and size N = 10 6 . The SF 
graphs have exponent 7 = 3 and size N = 10 4 . 
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Figure 3: Weight distribution of the MST for complete graphs 
of size N = 8000 at a = 0.5. Inset: For FD graphs of degree 
C = 3 (N = 10 6 ) and SF graphs of exponent y = 3 (N = 10 4 ) 
with parameters D = 25, a = 0.5. 



cavity approach for small D show that indeed the ex- 
ponent is 1/ [2 D — l) as it should and that there exist 
a constant additional (negative) term to the minimum 
cost which improves over the greedy algorithm. Table U 
shows the results of a power law fit to our data for the 
average minimum cost and number of Steiner nodes as 
a function of N . For D = N — 1 and a = 1 it is possi- 
ble to prove using techniques based on the computation 
tree that if the BP equations converge, then the result is 
optimal. Details about these results and hopefully about 
their extensions to the a < 1 case will be given elsewhere. 



Work is in progress to apply the algorithmic scheme we 
have presented to clustering, network reconstruction and 
protein pathways identification problems. 
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Table I: Comparing the exponents and prefactors for complete 
graphs. The parameters have been obtained by fitting data to 
a + bx c . In all the data N < 8000. Values in the parenthesis 
are known analytical results. 
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